AITopics

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Ruppik, Benjamin Matthias, von Rohrscheidt, Julius, van Niekerk, Carel, Heck, Michael, Vukovic, Renato, Feng, Shutong, Lin, Hsien-chin, Lubis, Nurul, Rieck, Bastian, Zibrowius, Marcus, Gašić, Milica

Less is More: Local Intrinsic Dimensions of Contextual Language Models

arXiv.org Artificial Intelligence Oct-28-2025

Understanding the internal mechanisms of large language models (LLMs) remains challenging. Even fundamental questions, such as how fine-tuning affects model behavior, often require extensive empirical evaluation. In this paper, we introduce a novel perspective based on the geometric properties of contextual latent embeddings to study the effects of training and fine-tuning. To that end, we measure the local dimensions of a contextual language model's latent space and analyze their shifts during training and fine-tuning. We show that the local dimensions provide insights into the model's training dynamics and generalization ability. Specifically, the mean of the local dimensions predicts three phenomena: exhaustion of the model's training capabilities, as exemplified in a dialogue state tracking task; overfitting, as demonstrated in an emotion recognition task; and grokking, as illustrated with an arithmetic task. Furthermore, our experiments suggest a practical heuristic: reductions in the mean local dimension tend to accompany and predict subsequent performance gains. Through this exploration, we aim to provide practitioners with a deeper understanding of the implications of fine-tuning on embedding spaces, facilitating informed decisions when configuring models for specific applications. The results of this work contribute to the ongoing discourse on the interpretability, adaptability, and generalizability of LLMs by bridging the gap between intrinsic model mechanisms and geometric properties in the respective embeddings.
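To make "mean local dimension" concrete, here is a minimal sketch of one common local estimator, the Levina-Bickel maximum-likelihood estimate from k-nearest-neighbour distances. The abstract does not say which estimator the authors use, so the estimator choice, the function names, and the synthetic data (a 2-D plane embedded in 5-D space, standing in for real embeddings) are all assumptions made for illustration.

```python
import math
import random


def local_dimension(points, index, k=20):
    """Levina-Bickel MLE of the intrinsic dimension around points[index].

    Assumption: this estimator is chosen for illustration only; it is not
    taken from the abstract above.
    """
    x = points[index]
    # Distances from x to every other point, keeping the k smallest non-zero ones.
    dists = sorted(
        d for i, p in enumerate(points)
        if i != index and (d := math.dist(x, p)) > 0.0
    )[:k]
    t_k = dists[-1]
    # Log-ratios of the k-th neighbour distance to each closer neighbour distance.
    log_ratios = [math.log(t_k / t) for t in dists[:-1]]
    return len(log_ratios) / sum(log_ratios)


def mean_local_dimension(points, k=20):
    """Average of the local estimates over all points."""
    return sum(local_dimension(points, i, k) for i in range(len(points))) / len(points)


def sample_plane_in_5d(n, seed=0):
    """Hypothetical test data: a flat 2-D square embedded in 5-D space."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random(), 0.0, 0.0, 0.0) for _ in range(n)]


if __name__ == "__main__":
    cloud = sample_plane_in_5d(400)
    print(round(mean_local_dimension(cloud), 2))
```

On this toy data the estimate should land near 2 rather than the ambient dimension 5, which is the sense in which a local dimension describes the geometry of the point cloud and not the width of the embedding vector.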

dimension, large language model, machine learning, (20 more...)

2506.01034

Country:

Asia (0.93)
Europe > Germany (0.46)
Europe > Austria (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

Extending Gossip Algorithms to Distributed Estimation of U-statistics

Neural Information Processing Systems Oct-2-2025, 04:00:46 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, gossip algorithm, node, (14 more...)

Country:

North America > United States > Massachusetts (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)

Neural Information Processing Systems Aug-14-2025, 11:45:28 GMT

437d9bde2999f6e3e854e09f250261a5-Paper-Conference.pdf

algorithm, cutset network, probability distribution, (15 more...)

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Chin, Khai Yi, Pinciroli, Carlo

Adaptive Self-Calibration for Minimalistic Collective Perception by Imperfect Robot Swarms

arXiv.org Artificial Intelligence Oct-28-2024

Collective perception is a fundamental problem in swarm robotics, often cast as best-of-$n$ decision-making. Past studies involve robots with perfect sensing or with small numbers of faulty robots. We previously addressed these limitations by proposing an algorithm, here referred to as Minimalistic Collective Perception (MCP) [arxiv:2209.12858], to reach correct decisions despite the entire swarm having severely damaged sensors. However, this algorithm assumes that sensor accuracy is known, which may be infeasible in reality. In this paper, we eliminate this assumption to (i) investigate the decline of estimation performance and (ii) introduce an Adaptive Sensor Degradation Filter (ASDF) to mitigate the decline. We combine the MCP algorithm and a hypothesis test to enable adaptive self-calibration of robots' assumed sensor accuracy. We validate our approach across several parameters of interest. Our findings show that estimation performance by a swarm with correctly known accuracy is superior to that by a swarm unaware of its accuracy. However, the ASDF drastically mitigates the damage, even reaching the performance levels of robots aware a priori of their correct accuracy.
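The dependence on known sensor accuracy can be illustrated with a small simulation. This is a sketch under an assumed sensor model, not the MCP algorithm or the ASDF themselves: a robot reads black or white tiles, each reading is correct with a single probability `accuracy`, and the fill ratio is recovered by inverting that noise model. The function names and parameter values are hypothetical.

```python
import random


def observe(true_fill, accuracy, n_obs, rng):
    """Count 'black' reports from a sensor that is correct with probability `accuracy`.

    Assumed model (not from the abstract): each tile is black with probability
    `true_fill`, and each reading is flipped with probability 1 - accuracy.
    """
    black_reports = 0
    for _ in range(n_obs):
        tile_is_black = rng.random() < true_fill
        reading_correct = rng.random() < accuracy
        reported_black = tile_is_black if reading_correct else not tile_is_black
        black_reports += reported_black
    return black_reports


def estimate_fill(black_reports, n_obs, assumed_accuracy):
    """Invert the noise model: E[observed] = f * a + (1 - f) * (1 - a)."""
    observed = black_reports / n_obs
    raw = (observed + assumed_accuracy - 1.0) / (2.0 * assumed_accuracy - 1.0)
    return min(1.0, max(0.0, raw))  # clip to a valid fill ratio


if __name__ == "__main__":
    rng = random.Random(1)
    reports = observe(true_fill=0.7, accuracy=0.8, n_obs=20000, rng=rng)
    print(estimate_fill(reports, 20000, assumed_accuracy=0.8))   # correct assumption
    print(estimate_fill(reports, 20000, assumed_accuracy=0.95))  # overconfident assumption
```

With the correct accuracy the estimate recovers the true fill ratio up to sampling noise, while an overconfident assumed accuracy biases it. This is the kind of gap a self-calibration step would have to close.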

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

2410.21546

Country:

Oceania > Australia (0.04)
North America > United States > Michigan > Wayne County > Detroit (0.04)
North America > United States > Massachusetts > Worcester County > Worcester (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Zhou, Tianlong, Shang, Jun, Rao, Weixiong

Collaborative State Fusion in Partially Known Multi-agent Environments

arXiv.org Artificial Intelligence Oct-19-2024

In this paper, we study the collaborative state fusion problem in a multi-agent environment, where mobile agents collaborate to track movable targets. Due to the limited sensing range and potential errors of on-board sensors, individual observations must be aggregated to fuse target states and improve target state estimation. Existing schemes do not perform well due to (1) the impractical assumption of a fully known prior target state-space model and (2) observation outliers from individual sensors. To address these issues, we propose a two-stage collaborative fusion framework, namely Learnable Weighted Robust Fusion (LoF). LoF combines a local state estimator (e.g., Kalman Filter) with a learnable weight generator to address the mismatch between the prior state-space model and underlying patterns of moving targets. Moreover, given observation outliers, we develop a time-series soft medoid (TSM) scheme to perform robust fusion. We evaluate LoF in a collaborative detection simulation environment with promising results. In an example setting with 4 agents and 2 targets, LoF leads to a 9.1% higher fusion gain compared to the state-of-the-art.
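As a rough illustration of why a soft medoid resists observation outliers, the sketch below implements a plain (non-time-series) soft medoid: a weighted mean whose weights shrink exponentially with each observation's total distance to the others. It is not the TSM scheme from the abstract. The function name, the temperature value, and the sample observations are assumptions.

```python
import math


def soft_medoid(points, temperature=1.0):
    """Weighted mean with weights softmax(-total_distance / temperature).

    Illustrative only: a generic soft medoid, not the paper's time-series variant.
    """
    # Total distance from each observation to all others (outliers score high).
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    lowest = min(totals)  # subtract the minimum for numerical stability
    raw = [math.exp(-(t - lowest) / temperature) for t in totals]
    norm = sum(raw)
    weights = [r / norm for r in raw]
    dim = len(points[0])
    return tuple(sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim))


if __name__ == "__main__":
    # Three agents roughly agree on the target position; one reports an outlier.
    observations = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (10.0, 10.0)]
    print(soft_medoid(observations, temperature=0.5))
```

A plain average of these four observations would be pulled to (3.25, 3.25), whereas the soft medoid stays close to the cluster of agreeing agents. A lower temperature moves the result toward the hard medoid and a higher one toward the ordinary mean.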

artificial intelligence, fusion, information fusion, (18 more...)

2410.15137

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Zhang, Qiong, Chen, Jiahua

Byzantine-tolerant distributed learning of finite mixture models

arXiv.org Machine Learning Jul-18-2024

This paper proposes two split-and-conquer (SC) learning estimators for finite mixture models that are tolerant to Byzantine failures. In SC learning, individual machines obtain local estimates, which are then transmitted to a central server for aggregation. During this communication, the server may receive malicious or incorrect information from some local machines, a scenario known as Byzantine failures. While SC learning approaches have been devised to mitigate Byzantine failures in statistical models with Euclidean parameters, developing Byzantine-tolerant methods for finite mixture models with non-Euclidean parameters requires a distinct strategy. Unlike existing methods, our proposed distance-based methods require no hyperparameter tuning and are resilient to Byzantine failures while achieving high statistical efficiency. We validate the effectiveness of our methods both theoretically and empirically via experiments on simulated and real data from machine learning applications for digit recognition. The code for the experiments can be found at https://github.com/SarahQiong/RobustSCGMM.
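The general idea of distance-based robust aggregation can be sketched as follows. This is not either of the estimators proposed in the paper: it is a generic rule that picks the local estimate with the smallest median distance to the others, which stays among the honest estimates as long as fewer than half of the machines are Byzantine. The `distance` argument is pluggable so that a non-Euclidean distance between mixing distributions could be supplied; the Euclidean distance and the sample values here are assumptions for the demo.

```python
import math
import statistics


def most_central_estimate(estimates, distance):
    """Return the estimate whose median distance to all other estimates is smallest.

    Generic illustration of distance-based filtering; `distance` is any
    symmetric, non-negative function of two estimates.
    """
    def score(i):
        return statistics.median(
            distance(estimates[i], other)
            for j, other in enumerate(estimates)
            if j != i
        )

    best = min(range(len(estimates)), key=score)
    return estimates[best]


if __name__ == "__main__":
    # Four honest local estimates and one corrupted transmission (hypothetical values).
    honest = [(0.0, 1.0), (0.1, 0.9), (-0.1, 1.1), (0.05, 1.0)]
    byzantine = [(50.0, -3.0)]
    print(most_central_estimate(honest + byzantine, math.dist))
```

The rule has no tuning parameter, which mirrors the tuning-free property the abstract highlights, though the paper's own construction may differ substantially.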

byzantine failure, estimator, mixture model, (16 more...)

2407.1398

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Neural Information Processing Systems Mar-12-2024, 22:28:10 GMT

Extending Gossip Algorithms to Distributed Estimation of U-Statistics

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has received a good deal of attention, computation of U-statistics, which relies on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the U-statistic of interest. We establish convergence rate bounds of O(1/t) and O(log t/t) for the synchronous and asynchronous cases respectively, where t is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
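The two ingredients named above can be sketched separately. The first part computes a degree-two U-statistic centrally, as an average of a symmetric kernel over all pairs of observations, with kernels for the empirical variance and the Gini mean difference. The second part is the standard randomized pairwise gossip for the sample mean, the well-studied baseline that the abstract contrasts with. Neither part is the paper's proposed algorithm, and all function names are hypothetical.

```python
import random
from itertools import combinations


def u_statistic(sample, kernel):
    """Average of a symmetric kernel over all unordered pairs of observations."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def variance_kernel(x, y):
    """Kernel whose U-statistic is the unbiased empirical variance."""
    return 0.5 * (x - y) ** 2


def gini_kernel(x, y):
    """Kernel whose U-statistic is the Gini mean difference."""
    return abs(x - y)


def gossip_mean(values, edges, n_rounds, seed=0):
    """Standard pairwise gossip averaging (baseline for means, not for U-statistics).

    Each round picks a random edge (i, j) and replaces both local estimates
    by their average, which preserves the network-wide sum.
    """
    rng = random.Random(seed)
    estimates = list(values)
    for _ in range(n_rounds):
        i, j = rng.choice(edges)
        estimates[i] = estimates[j] = 0.5 * (estimates[i] + estimates[j])
    return estimates


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 5.0, 9.0]
    print(u_statistic(data, variance_kernel))
    print(u_statistic(data, gini_kernel))
    complete_graph = list(combinations(range(len(data)), 2))
    print(gossip_mean(data, complete_graph, n_rounds=500))
```

The pairwise structure is what makes the decentralized case harder: a node averaging with its neighbours only ever sees its own observation, whereas every kernel evaluation needs two observations held at different nodes, which is why the proposed algorithms also propagate data across the network.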

algorithm, artificial intelligence, machine learning, (16 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)

Lanegger, Christian, Oleynikova, Helen, Pantic, Michael, Ott, Lionel, Siegwart, Roland

To Fuse or Not to Fuse: Measuring Consistency in Multi-Sensor Fusion for Aerial Robots

arXiv.org Artificial Intelligence Dec-22-2023

Aerial vehicles are no longer limited to flying in open space: recent work has focused on aerial manipulation and up-close inspection. Such applications place stringent requirements on state estimation: the robot must combine state information from many sources, including onboard odometry and global positioning sensors. However, flying close to or in contact with structures is a degenerate case for many sensing modalities, and the robot's state estimation framework must intelligently choose which sensors are currently trustworthy. We evaluate a number of metrics to judge the reliability of sensing modalities in a multi-sensor fusion framework, then introduce a consensus-finding scheme that uses this metric to choose which sensors to fuse or not to fuse. Finally, we show that such a fusion framework is more robust and accurate than fusing all sensors all the time and demonstrate how such metrics can be informative in real-world experiments in indoor-outdoor flight and bridge inspection.

local estimate, metric, sensor, (16 more...)